Abstract



Variational sparsity regularization based on $\ell^1$-norms and other nonlinear functionals
has gained enormous attention recently, both with respect to its applications
and its mathematical analysis. A focus in regularization theory has been to develop
error estimates in terms of the regularization parameter and the noise strength. For this
purpose, specific error measures such as Bregman distances and specific conditions on
the solution such as source conditions or variational inequalities have been developed
and used.

In this paper we provide, for a certain class of ill-posed linear operator equations, 
a convergence analysis that works for solutions that are not completely sparse, but 
have a fast decaying nonzero part. This case is not covered by standard source 
conditions, but surprisingly can be treated with an appropriate variational inequality. 
As a consequence the paper also provides the first examples where the variational 
inequality approach, which was often believed to be equivalent to appropriate source 
conditions, can indeed go farther than the latter. 
1 Introduction

Variational problems of the form 
have become an important and standard tool in the regularization of operator equations
$Ax = y$. In the field of compressed sensing, with a usually non-injective but quite
well-conditioned finite-dimensional operator $A$, the sparsity modeled via $\ell^1$-minimization
yields the appropriate prior knowledge to uniquely restore the desired solution. In inverse
problems, with usually an infinite-dimensional compact operator $A$, the sparsity prior
allows for stable reconstructions even in the presence of noise. There is a comprehensive
literature concerning the $\ell^1$-regularization of ill-posed problems under sparsity constraints
including assertions on convergence rates (cf. e.g. [7j and [21 El HI [12j HH [151 EU E21 12H 
[251 [231126]). 

A natural question arising in problems of this type is the asymptotic analysis of such
variational problems as $\alpha \to 0$, respectively $y^\delta \to y$, where $y$ are the data that would
be produced by an exact solution of the problem. While it is straightforward to show
convergence of subsequences in weak topologies, quantitative error estimates are much
more involved. For a long time it was even an open question which conditions
and error measures are the right ones for such an analysis. In recent years, following [5], the use
of Bregman distances has evolved into a standard tool. First estimates were based
on source conditions of the form (cf. [6] for the $\ell^1$-case and [22] for further analysis in a
compressed sensing framework)

$$\exists\, w : \quad A^* w \in \partial \|x^\dagger\|_{\ell^1(\mathbb{N})}, \tag{1.2}$$

where $x^\dagger$ is the exact solution. Extensions to approximate source conditions (cf. [17]) and
a different, but seemingly equivalent approach based on variational inequalities (cf. [9])
have been developed subsequently. In the case of $\ell^1$-regularization it has been shown that
source conditions are directly related to sparsity, hence error estimates have been derived
with constants depending on the sparsity level.

However, one can also consider solutions which are only approximately sparse, i.e. few of the
components are large and the majority are small and decay fast to zero. Such
a model is actually more realistic in many cases, e.g. when applying wavelets to audio
signals or natural images. It is the basis of compression algorithms that most of the
coefficients are very small and can be ignored, i.e. set to zero. In inversion methods, this
type of a-priori information can be analyzed for $\ell^1$-regularization from two perspectives.
The first is to assume that the relevant solution is indeed sparse, i.e. we are interested in
a sparse approximation $x^*$ to $x^\dagger$. In such a case one should clearly analyze a systematic
error $A(x^\dagger - x^*)$ in addition to the usual error, which is however not straightforward under
general assumptions. The second approach, which we adopt in this paper, is to
analyze the error in approximating $x^\dagger$ itself, using a natural separation into the (few) large and
the remaining small entries.

The further goal of the paper is twofold. On the one hand, we are going to derive
novel convergence rates for Tikhonov regularized solutions of linear ill-posed operator
equations, where the penalty functional is the $\ell^1$-norm. We will prove convergence rates
when the exact solution is not sparse but in $\ell^1(\mathbb{N})$. Moreover, we will formulate the
specific manifestations of solution smoothness in this case, also essentially based on the
decay rate of solution components. On the other hand, we give a first example for an
application of the variational inequality approach (see for details |B1 [TSJ [THJ Q22 [2D]) when 
neither source conditions nor approximate source conditions in the Banach space setting
(cf. [27, Section 3.2.2]) are available. The necessity of exploiting source conditions in the
course of constructing variational inequalities for obtaining convergence rates in Tikhonov
regularization was up to now considered as a weakness of the approach.

The paper is organized as follows: In Section 2 we will fix the basic problem setup, 
notations, and assumptions, and then proceed to an overview of smoothness conditions 
for proving convergence rates of variational regularizations in Section 3. In Section 4 we 
verify that non-sparse signals indeed fail to satisfy source conditions and even approximate 
source conditions. In Section 5 we derive a new variational inequality and use it to prove 
convergence rates for approximately sparse solutions. 



2 Problem Setting and Assumptions 

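Let $\tilde A \in \mathcal{L}(\tilde X, Y)$ be an injective and bounded linear operator mapping between the
infinite dimensional Banach spaces $\tilde X$ and $Y$ with norms $\|\cdot\|_{\tilde X}$ and $\|\cdot\|_Y$ as well as with
topological duals $\tilde X^*$ and $Y^*$, respectively, where we assume that the range of $\tilde A$ is not
closed. This means that we have $\mathcal{R}(\tilde A) \neq \overline{\mathcal{R}(\tilde A)}^{\,Y}$, which is equivalent to the fact that the
inverse

$$\tilde A^{-1} : \mathcal{R}(\tilde A) \subset Y \to \tilde X$$

is an unbounded linear operator and that there is no constant $0 < C < \infty$ such that

$$\|\tilde x\|_{\tilde X} \le C \, \|\tilde A \tilde x\|_Y \quad \text{for all } \tilde x \in \tilde X.$$

Moreover, we denote by $\{u_k\}_{k \in \mathbb{N}} \subset \tilde X$ a bounded Schauder basis in $\tilde X$. This means that
there is some $K > 0$ such that

$$\|u_k\|_{\tilde X} \le K \quad \text{for all } k \in \mathbb{N} \tag{2.1}$$

and any element $\tilde x \in \tilde X$ can be represented as an infinite series $\tilde x = \sum_{k=1}^{\infty} x_k u_k$ convergent in
$\tilde X$ with uniquely determined coefficients $x_k \in \mathbb{R}$ in the sense of
$\lim_{n \to \infty} \big\| \tilde x - \sum_{k=1}^{n} x_k u_k \big\|_{\tilde X} = 0$.

In the sequel we always consider the coefficients $x_k$, $k = 1, 2, \dots$, as components of an
infinite real sequence $x := (x_1, x_2, \dots)$ and following the setting in [10] we assume that
$L : \ell^1(\mathbb{N}) \to \tilde X$ is the synthesis operator defined as

$$Lx := \sum_{k=1}^{\infty} x_k u_k \in \tilde X \quad \text{for } x = (x_1, x_2, \dots) \in \ell^1(\mathbb{N}).$$

Evidently, $L$ is a well-defined, injective and, as one simply verifies, also bounded linear
operator, i.e. $L \in \mathcal{L}(\ell^1(\mathbb{N}), \tilde X)$.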

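As usual

$$\|x\|_{\ell^q(\mathbb{N})} := \left( \sum_{k=1}^{\infty} |x_k|^q \right)^{1/q}$$

describe the norms in the Banach spaces $\ell^q(\mathbb{N})$, $1 \le q < \infty$, and
$\|x\|_{\ell^\infty(\mathbb{N})} := \sup_{k \in \mathbb{N}} |x_k|$ the norm in $\ell^\infty(\mathbb{N})$. The same norm
$\|x\|_{c_0} := \sup_{k \in \mathbb{N}} |x_k|$ is used for the Banach space $c_0$ of
infinite sequences tending to zero. By $\ell^0(\mathbb{N})$ we denote the set of sparse sequences, where
$x_k \neq 0$ only occurs for a finite number of components.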

In this paper, the focus is on elements $\tilde x = Lx \in \tilde X$ which correspond to sequences
$x \in \ell^1(\mathbb{N})$ and we choose the Schauder basis $\{u_k\}_{k \in \mathbb{N}}$ such that

$$\overline{\mathcal{R}(L)}^{\,\tilde X} = \tilde X. \tag{2.2}$$

When setting

$$X := \ell^1(\mathbb{N}), \qquad A := \tilde A \circ L \in \mathcal{L}(\ell^1(\mathbb{N}), Y),$$

noting that $A$ is also injective since $\tilde A$ and $L$ are, we are searching for stable approximations
to elements $x \in \ell^1(\mathbb{N})$ satisfying the linear operator equation

$$Ax = y, \qquad x \in X, \quad y \in Y, \tag{2.3}$$

in an approximate manner, where instead of the exact right-hand side $y \in \mathcal{R}(A)$ only
noisy data $y^\delta \in Y$ satisfying the noise model

$$\|y - y^\delta\|_Y \le \delta \tag{2.4}$$

with noise level $\delta > 0$ are given.

Proposition 2.1. Under the given setting including the conditions (2.1) and (2.2) the
linear operator equation (2.3) is ill-posed, i.e., we have $\mathcal{R}(A) \neq \overline{\mathcal{R}(A)}^{\,Y}$.

Proof. By the continuity of $\tilde A$ and by (2.2), $\mathcal{R}(A)$ is dense in $\mathcal{R}(\tilde A)$. If $\mathcal{R}(A)$ were
closed, then we would have $\mathcal{R}(A) = \mathcal{R}(\tilde A)$ and hence $\mathcal{R}(\tilde A)$ would be closed, too, which contradicts
our assumptions. □



In the sequel, let $\langle v^*, v \rangle_{B^* \times B}$ denote the dual pairing of an element $v$ from the
Banach space $B$ with an element $v^*$ from its dual space $B^*$. Furthermore, we denote by
$e_k := (0, \dots, 0, 1, 0, \dots)$, with $1$ at the $k$-th position for $k = 1, 2, \dots$, the elements of the
normalized canonical Schauder basis in $\ell^q(\mathbb{N})$, $1 \le q < \infty$, which is also a Schauder basis
in $c_0$. That means, we find $\lim_{n \to \infty} \big\| x - \sum_{k=1}^{n} x_k e_k \big\|_{c_0} = 0$ for all $x = (x_1, x_2, \dots) \in c_0$ and
$\lim_{n \to \infty} \big\| x - \sum_{k=1}^{n} x_k e_k \big\|_{\ell^q(\mathbb{N})} = 0$ for all $x = (x_1, x_2, \dots) \in \ell^q(\mathbb{N})$, $1 \le q < \infty$, but not for
$q = \infty$. Moreover, we suppose that the following standing assumptions hold:

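Assumption 2.2.

(a) The element $x^\dagger \in \ell^1(\mathbb{N})$ solves the operator equation (2.3).

(b) We have the limit condition $\lim_{k \to \infty} \|A e_k\|_Y = 0$.

(c) For all $k \in \mathbb{N}$ there exist $f_k \in Y^*$, $f_k \neq 0$, such that $e_k = A^* f_k$, i.e., we have
$x_k = \langle e_k, x \rangle_{\ell^\infty(\mathbb{N}) \times \ell^1(\mathbb{N})} = \langle f_k, Ax \rangle_{Y^* \times Y}$ for all $x = (x_1, x_2, \dots) \in \ell^1(\mathbb{N})$.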
Remark 2.3. One simply sees that Assumption 2.2 is appropriate for the introduced
model of the linear operator equation (2.3) with injective forward operator $A = \tilde A \circ L$ and
its ill-posedness verified by Proposition 2.1. Firstly, by item (c) the operator
$A : \ell^1(\mathbb{N}) \to Y$ is injective. Namely, we have $x_k = \langle e_k, x \rangle_{\ell^\infty(\mathbb{N}) \times \ell^1(\mathbb{N})} = 0$ for all $k \in \mathbb{N}$
whenever $x = (x_1, x_2, \dots) \in \ell^1(\mathbb{N})$ and $Ax = 0$. This yields $x = 0$ and hence the injectivity
of $A$. From the injectivity of $A$, however, we have that $x^\dagger$ from item (a) of Assumption 2.2
is the uniquely determined solution of (2.3) for given $y \in \mathcal{R}(A)$. Secondly, from
item (b) of Assumption 2.2 it follows that there is no constant $0 < C < \infty$ such that
$1 = \|e_k\|_{\ell^1(\mathbb{N})} \le C \, \|A e_k\|_Y$ for all $k \in \mathbb{N}$ and hence we have that $\mathcal{R}(A)$ is a non-closed
subset of the Banach space $Y$. Consequently, the linear operator equation (2.3) is
ill-posed under Assumption 2.2 and the inverse operator $A^{-1} : \mathcal{R}(A) \subset Y \to \ell^1(\mathbb{N})$, which
exists due to the injectivity of $A$, is an unbounded linear operator. Hence, the stable
approximate solution of (2.3) based on noisy data $y^\delta \in Y$ satisfying (2.4) requires some
kind of regularization. □

Note that by the closed range theorem the range $\mathcal{R}(A^*)$ is a non-closed subset of
$\ell^\infty(\mathbb{N})$, but not dense in $\ell^\infty(\mathbb{N})$, since it is a subset of $c_0$, as the following proposition
indicates, and $c_0$ is not dense in $\ell^\infty(\mathbb{N})$ with respect to the supremum norm.

Proposition 2.4. Under Assumption 2.2 we have $\overline{\mathcal{R}(A^*)}^{\,\ell^\infty(\mathbb{N})} = c_0$.

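Proof. First we show that $\mathcal{R}(A^*) \subseteq c_0$. For $w \in Y^*$ we obviously have $A^* w \in \ell^\infty(\mathbb{N})$ and

$$|[A^* w]_k| = |\langle A^* w, e_k \rangle_{\ell^\infty(\mathbb{N}) \times \ell^1(\mathbb{N})}| = |\langle w, A e_k \rangle_{Y^* \times Y}| \le \|w\|_{Y^*} \, \|A e_k\|_Y.$$

Thus by Assumption 2.2 (b), $A^* w \in c_0$.

It remains to show that each $z \in c_0$ can be approximated by a sequence $\{A^* w_n\}_{n \in \mathbb{N}}$
with $w_n \in Y^*$. For this purpose we define $w_n = \sum_{k=1}^{n} z_k f_k$ with $f_k$ from Assumption 2.2 (c).
Then

$$\|z - A^* w_n\|_{\ell^\infty(\mathbb{N})} = \Big\| \sum_{k=n+1}^{\infty} z_k e_k \Big\|_{\ell^\infty(\mathbb{N})}$$

and therefore $\|z - A^* w_n\|_{\ell^\infty(\mathbb{N})} \to 0$ as $n \to \infty$, since $z \in c_0$. □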

Remark 2.5. An inspection of the proof shows that for Proposition 2.4 the condition (b)
of Assumption 2.2 can be weakened to
(b') We have weak convergence $A e_k \rightharpoonup 0$ in $Y$ as $k \to \infty$.

Further, item (c) of Assumption 2.2 implies that

$$|x_k| \le \|f_k\|_{Y^*} \, \|Ax\|_Y \quad \text{for all } x \in \ell^1(\mathbb{N}) \text{ and } k \in \mathbb{N}. \tag{2.5}$$

Applying this inequality to $x = e_k$ we obtain $1 \le \|f_k\|_{Y^*} \, \|A e_k\|_Y$ and therefore on the one
hand

$$\|f_k\|_{Y^*} \ge \frac{1}{\|A e_k\|_Y} \tag{2.6}$$

and on the other hand by exploiting item (b)

$$\lim_{k \to \infty} \|f_k\|_{Y^*} \ge \lim_{k \to \infty} \frac{1}{\|A e_k\|_Y} = \infty. \tag{2.7}$$
If $\ell_g^\infty(\mathbb{N})$ denotes the weighted $\ell^\infty$-space with positive weights
$g = (1/\|f_1\|_{Y^*}, 1/\|f_2\|_{Y^*}, \dots)$ and norm

$$\|x\|_{\ell_g^\infty(\mathbb{N})} := \sup_{k \in \mathbb{N}} \frac{|x_k|}{\|f_k\|_{Y^*}},$$

from (2.6) we have that

$$\|x\|_{\ell_g^\infty(\mathbb{N})} \le \|A\| \, \|x\|_{\ell^\infty(\mathbb{N})} \quad \text{for all } x \in \ell^\infty(\mathbb{N})$$

and from (2.5) that

$$\|x\|_{\ell_g^\infty(\mathbb{N})} \le \|Ax\|_Y \quad \text{for all } x \in \ell^1(\mathbb{N}).$$

Hence the norm in $\ell_g^\infty(\mathbb{N})$ is weaker than the standard supremum norm in $\ell^\infty(\mathbb{N})$ and
we have $0 < C < \infty$ such that $\|Gx\|_{\ell_g^\infty(\mathbb{N})} \le C \, \|Ax\|_Y$ for all $x \in \ell^1(\mathbb{N})$, where $G$ is the
embedding operator from $\ell^1(\mathbb{N})$ to $\ell_g^\infty(\mathbb{N})$, the behaviour of which characterizes the nature
of ill-posedness of problem (2.3). We refer to [21, Remark 3.5] for similar considerations
in $\ell^q$-regularization. We also mention that an assumption similar to (c) also appears in
(U Assumption 4.1(a)]. □

Example 2.6 (diagonal operator). In order to get some more insight into details, we
consider for a separable infinite dimensional Hilbert space $\tilde X$ and a selected orthonormal
basis $\{u_k\}_{k \in \mathbb{N}}$ in $\tilde X$ the compact linear operator $\tilde A : \tilde X \to \tilde X$ with diagonal structure.
That means we have $Y = \tilde X$ and $\tilde A$ possesses the singular system $\{\sigma_k, u_k, u_k\}_{k \in \mathbb{N}}$ such
that for the decreasingly ordered sequence of positive singular values $\{\sigma_k\}_{k \in \mathbb{N}}$, tending to
zero as $k \to \infty$, the equations $\tilde A u_k = \sigma_k u_k$ and $\tilde A^* u_k = \sigma_k u_k$ are valid for all $k \in \mathbb{N}$.
For $\tilde x = \sum_{k=1}^{\infty} x_k u_k$ we have the inner products $x_k = \langle \tilde x, u_k \rangle_{\tilde X}$ as square-summable components
in the infinite sequence $x = (x_1, x_2, \dots)$. Then the bounded linear synthesis operator
$L : X = \ell^1(\mathbb{N}) \to \tilde X$ is the composition $L = U \circ \mathcal{E}$ of the injective embedding operator $\mathcal{E}$
from $\ell^1(\mathbb{N})$ to $\ell^2(\mathbb{N})$ with $\overline{\mathcal{R}(\mathcal{E})}^{\,\ell^2(\mathbb{N})} = \ell^2(\mathbb{N})$ and the unitary operator $U$, which
characterizes the isometry between $\ell^2(\mathbb{N})$ and $\tilde X$. Hence, $\overline{\mathcal{R}(L)}^{\,\tilde X} = \tilde X$ and both conditions (2.1)
and (2.2) are satisfied for that example.

The injective linear operator $A = \tilde A \circ L : X \to Y$ is, as a composition of a bounded and
a compact linear operator, also compact and item (b) of Assumption 2.2 is satisfied because
of $\|A e_k\|_Y = \|\tilde A u_k\|_{\tilde X} = \|\sigma_k u_k\|_{\tilde X} = \sigma_k \to 0$ as $k \to \infty$. Since we have $[A^* u_k]_k = \sigma_k$ and
$[A^* u_k]_m = 0$ for $m \neq k$ as a consequence of $\tilde A^* u_k = \sigma_k u_k$, $k \in \mathbb{N}$, item (c) is fulfilled with
$f_k = \frac{1}{\sigma_k} u_k$, $k \in \mathbb{N}$, and $\|f_k\|_{Y^*} = \|f_k\|_{\tilde X} = \frac{1}{\sigma_k}$ tends to infinity as $k \to \infty$. □
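
The relations of the diagonal example can be checked on a finite truncation. The following Python sketch is an illustration only: the singular values $\sigma_k = k^{-\nu}$, the truncation length `n`, the test sequence `x` and all function names are assumptions made here and do not come from the example itself. In the coordinates of the orthonormal basis, the operator acts componentwise, and pairing the data $Ax$ with $f_k = u_k / \sigma_k$ recovers the coefficient $x_k$, as required by item (c) of the standing assumption.

```python
# Truncated illustration of a diagonal operator (assumed: sigma_k = k**(-nu)).

def sigma(k, nu=1.0):
    """Hypothetical singular values, decreasing to zero."""
    return k ** (-nu)


def apply_A(x, nu=1.0):
    """Coordinates of A x in the orthonormal basis: (A x)_k = sigma_k * x_k."""
    return [sigma(k, nu) * xk for k, xk in enumerate(x, start=1)]


def f(k, n, nu=1.0):
    """Coordinates of f_k = u_k / sigma_k in the basis u_1, ..., u_n."""
    return [1.0 / sigma(k, nu) if j == k else 0.0 for j in range(1, n + 1)]


def pairing(v, w):
    """Inner product of two coordinate vectors."""
    return sum(a * b for a, b in zip(v, w))


n = 8
x = [(-1) ** k / k ** 2 for k in range(1, n + 1)]

Ax = apply_A(x)
# Item (c): x_k = <f_k, A x> for every k.
recovered = [pairing(f(k, n), Ax) for k in range(1, n + 1)]
# Item (b): ||A e_k|| = sigma_k decreases, while ||f_k|| = 1 / sigma_k grows.
norms_Ae = [sigma(k) for k in range(1, n + 1)]
norms_f = [1.0 / sigma(k) for k in range(1, n + 1)]

print(recovered)
print(norms_f)
```

The growth of `norms_f` is the quantity that later enters the rate function, so a faster decay of the singular values makes the bound weaker.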

Our focus is on situations where we conjecture that the solutions of (2.3) are sparse,
i.e. $x \in \ell^0(\mathbb{N})$, or at least that the coefficients $x_k$ in $x \in \ell^1(\mathbb{N})$ are negligible for sufficiently
large $k \in \mathbb{N}$. Then it makes sense to use the $\ell^1$-regularization, and the regularized solutions
$x_\alpha^\delta \in \ell^1(\mathbb{N})$ are minimizers of the extremal problem

$$T_\alpha(x) := \frac{1}{p} \|Ax - y^\delta\|_Y^p + \alpha \, \|x\|_{\ell^1(\mathbb{N})} \to \min, \quad \text{subject to } x \in X = \ell^1(\mathbb{N}), \tag{2.8}$$

of Tikhonov type with regularization parameters $\alpha > 0$ and a misfit norm exponent
$1 < p < \infty$. Then the sublinear and continuous penalty functional $\Omega(x) := \|x\|_{\ell^1(\mathbb{N})}$
possessing finite values for all $x \in X$ is convex and lower semi-continuous. Since $X = \ell^1(\mathbb{N})$
is a non-reflexive Banach space we need some topology $\tau_X$ in $X$ which is weaker than the
canonical weak topology in $X$ in order to ensure stabilizing properties of the regularized
solutions. In other words, $\Omega$ must be a stabilizing functional in the sense that the sublevel
sets

$$\mathcal{M}_\Omega(c) := \{x \in X : \Omega(x) \le c\}$$

are $\tau_X$-sequentially precompact subsets of $X$ for all $c > 0$. Since $Z := c_0$ with $Z^* = \ell^1(\mathbb{N})$
is a separable predual Banach space of $X$, we can use the associated weak*-topology as
$\tau_X$ (cf. [27, Remark 4.9 and Lemma 4.10]). Note that $\Omega$ under consideration here is also
sequentially lower semi-continuous with respect to the weak*-topology. If the operator
$A$ can be proven to transform weak*-convergent sequences in $\ell^1(\mathbb{N})$ to weakly convergent
sequences in the Banach space $Y$, which will be done by Lemma 2.7 below, then existence
and stability of regularized solutions can be ensured. We refer for details to [18, §3] and
also to [6, 10]. Precisely, there even exist uniquely determined minimizers $x_\alpha^\delta$ for all $\alpha > 0$
and $y^\delta \in Y$, because the Tikhonov functional $T_\alpha$ is strictly convex due to the injectivity of
$A$. Moreover, the regularized solutions $x_\alpha^\delta$ are stable with respect to small data changes,
and we have $x_\alpha^\delta \in \ell^0(\mathbb{N})$ for fixed $\alpha > 0$. The last fact is proven in [21, Lemma 2.1] if $Y$
is a Hilbert space. For Banach spaces $Y$ the proof remains the same if one observes that
for each minimizer $x_\alpha^\delta$ there is some $\xi \in \ell^\infty(\mathbb{N})$ such that

$$\xi \in A^* \partial \Big( \tfrac{1}{p} \|\cdot - y^\delta\|_Y^p \Big)(A x_\alpha^\delta) \subseteq \mathcal{R}(A^*) \subseteq c_0 \quad \text{and} \quad -\xi \in \alpha \, \partial \big( \|\cdot\|_{\ell^1(\mathbb{N})} \big)(x_\alpha^\delta).$$
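
The sparsity of the regularized solutions can be made concrete in a special case. The sketch below is a hedged illustration under assumptions that are not part of the general setting: a diagonal operator with hypothetical singular values $\sigma_k = 1/k$, a Hilbert space $Y$, the exponent $p = 2$, noise-free data, and a finite truncation. Under these assumptions the Tikhonov functional decouples into scalar problems, and each component of the minimizer is given by soft-thresholding, $x_k = \operatorname{soft}(\sigma_k y_k, \alpha) / \sigma_k^2$. All names in the code are chosen here for illustration.

```python
def soft_threshold(v, alpha):
    """Scalar soft-thresholding: shrink v towards zero by alpha."""
    if v > alpha:
        return v - alpha
    if v < -alpha:
        return v + alpha
    return 0.0


def l1_tikhonov_diagonal(y, sigmas, alpha):
    """Minimizer of 0.5 * sum((s_k x_k - y_k)^2) + alpha * sum(|x_k|)."""
    return [soft_threshold(s * yk, alpha) / (s * s) for yk, s in zip(y, sigmas)]


def tikhonov_value(x, y, sigmas, alpha):
    """Value of the (truncated, p = 2) Tikhonov functional at x."""
    misfit = 0.5 * sum((s * xk - yk) ** 2 for xk, yk, s in zip(x, y, sigmas))
    return misfit + alpha * sum(abs(xk) for xk in x)


n = 50
sigmas = [1.0 / k for k in range(1, n + 1)]        # assumed singular values
x_true = [1.0 / k ** 2 for k in range(1, n + 1)]   # non-sparse, fast decaying
y = [s * xk for s, xk in zip(sigmas, x_true)]      # exact data
alpha = 1e-3

x_alpha = l1_tikhonov_diagonal(y, sigmas, alpha)
# Indices of the nonzero components of the regularized solution.
support = [k for k, xk in enumerate(x_alpha, start=1) if xk != 0.0]
print(support)
```

Although every component of `x_true` is nonzero, only the components with $\sigma_k |y_k| > \alpha$ survive, so the computed minimizer has finite support, in line with the statement $x_\alpha^\delta \in \ell^0(\mathbb{N})$.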

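Lemma 2.7. Under the assumptions stated above the operator $A : \ell^1(\mathbb{N}) \to Y$ is always
sequentially weak*-to-weak continuous, i.e., for a weak*-convergent sequence $x^{(n)} \rightharpoonup^* x$
in $\ell^1$ we have weak convergence $A x^{(n)} \rightharpoonup A x$ in $Y$.

Proof. Since the separable Banach space $c_0$, which has the same supremum norm as $\ell^\infty$,
is a predual space of $\ell^1$, i.e. any element $x = (x_1, x_2, \dots) \in \ell^1$ is a bounded linear functional
on $c_0$, the weak*-convergence $x^{(n)} \rightharpoonup^* x$ in $\ell^1$ can be written as

$$\lim_{n \to \infty} \langle x^{(n)}, g \rangle_{\ell^1 \times c_0} = \lim_{n \to \infty} \sum_{k \in \mathbb{N}} x_k^{(n)} g_k = \sum_{k \in \mathbb{N}} x_k g_k = \langle x, g \rangle_{\ell^1 \times c_0} \quad \text{for all } g = (g_1, g_2, \dots) \in c_0.$$

With the bounded linear operator $A^* : Y^* \to \ell^\infty$ we can further conclude from $\mathcal{R}(A^*) \subseteq c_0$,
which follows from Proposition 2.4, that $A^* f \in c_0$ for all $f \in Y^*$ and that

$$\langle f, A x^{(n)} \rangle_{Y^* \times Y} = \langle A^* f, x^{(n)} \rangle_{\ell^\infty \times \ell^1} = \langle x^{(n)}, A^* f \rangle_{\ell^1 \times c_0} \quad \text{for all } f \in Y^*.$$

Hence we have

$$\lim_{n \to \infty} \langle f, A x^{(n)} \rangle_{Y^* \times Y} = \lim_{n \to \infty} \langle x^{(n)}, A^* f \rangle_{\ell^1 \times c_0} = \langle x, A^* f \rangle_{\ell^1 \times c_0} = \langle f, A x \rangle_{Y^* \times Y} \quad \text{for all } f \in Y^*,$$

which yields the weak convergence in $Y$ and completes the proof. □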
3 Manifestations of smoothness for convergence rates



It is well-known that for linear ill-posed operator equations (2.3) with $A \in \mathcal{L}(X, Y)$,
formulated in Banach spaces $X$ and $Y$ with some solution $x^\dagger \in X$, convergence rates of
regularized solutions

$$E(x_\alpha^\delta, x^\dagger) = O(\varphi(\delta)) \quad \text{as } \delta \to 0 \tag{3.1}$$

evaluated by some nonnegative error measure $E$ and some index function $\varphi$ require additional
properties of $x^\dagger$ which express some kind of smoothness of the solution with respect
to the forward operator $A : X \to Y$ and its adjoint $A^* : Y^* \to X^*$. We call a function
$\varphi : (0, \infty) \to (0, \infty)$ an index function if it is continuous, strictly increasing and satisfies the
limit condition $\lim_{t \to +0} \varphi(t) = 0$. Moreover, we denote by $x_\alpha^\delta$ the minimizers of

$$\frac{1}{p} \|Ax - y^\delta\|_Y^p + \alpha \, \Omega(x) \to \min, \quad \text{subject to } x \in X,$$

for $1 < p < \infty$ and some convex stabilizing penalty functional $\Omega : X \to [0, \infty]$. Then the
original form of smoothness is given by source conditions

$$\xi = A^* w, \qquad w \in Y^*, \tag{3.2}$$

for subgradients $\xi \in \partial \Omega(x^\dagger) \subseteq X^*$, and the error can be measured by the Bregman
distance

$$E(x, x^\dagger) := \Omega(x) - \Omega(x^\dagger) - \langle \xi, x - x^\dagger \rangle_{X^* \times X} \tag{3.3}$$

as introduced in [5]. Then convergence rates (3.1) with $\varphi(t) = t$ can be derived under
appropriate choices of the regularization parameter $\alpha > 0$ (cf. [5j [THl 126]).

If the subgradient $\xi \in X^*$ fails to satisfy (3.2), then one can use approximate source
conditions and ask for the degree of violation of $\xi$ with respect to the benchmark source
condition (3.2). This violation is, for example, expressed by properties of the strictly
positive, convex and nonincreasing distance function

$$d_\xi(R) := \inf_{w \in Y^* :\ \|w\|_{Y^*} \le R} \|\xi - A^* w\|_{X^*}. \tag{3.4}$$

If the limit condition

$$\lim_{R \to \infty} d_\xi(R) = 0 \tag{3.5}$$

holds, then one can prove convergence rates (3.1) with $E$ from (3.3) and $\varphi$ depending
on $d_\xi$ (cf. [3], [18], [27, Chapter 3], and [19, Appendix A]). If, for example, the Bregman
distance is $q$-coercive with $q \ge 2$ and $1/q + 1/q^* = 1$, then we have

$\varphi(t) = \left[ d_\xi\big(\Phi^{-1}(t)\big) \right]^{q^*}$, where $\Phi(R) :=$

If, however, the distance function $d_\xi$ does not satisfy (3.5), this approach fails. As mentioned
in [3], such a situation is only possible if the biadjoint operator $A^{**} : X^{**} \to Y^{**}$
mapping between the bidual spaces of $X$ and $Y$ is not injective.

An alternative manifestation of solution smoothness is given by variational inequalities
(cf., e.g., [18, 20]), where in the sequel our focus will be on the variant formulated in
Assumption 3.1, originally suggested in [HI ITT].
Assumption 3.1. For given nonnegative error measure $E$, convex stabilizing functional
$\Omega$ and $x^\dagger \in X$, let there exist a concave index function $\varphi$, a set $\mathcal{M} \subseteq X$ with $x^\dagger \in \mathcal{M}$ and
constants $\beta > 0$ as well as $C > 0$ such that

$$\beta \, E(x, x^\dagger) \le \Omega(x) - \Omega(x^\dagger) + C \, \varphi\big( \|A(x - x^\dagger)\|_Y \big) \quad \text{for all } x \in \mathcal{M}. \tag{3.6}$$

In the case of an appropriate choice of the regularization parameter $\alpha > 0$ the index
function $\varphi$ in the variational inequality (3.6) immediately determines the convergence
rate of Tikhonov-regularized solutions $x_\alpha^\delta$ to $x^\dagger$. Proofs of the assertions of the following
proposition can be found in [19, Theorem 1] and [8, Chapter 4]. We also refer to pQ.

Proposition 3.2. Let the regularization parameter be chosen a priori as $\alpha = \alpha(\delta) := \delta^p / \varphi(\delta)$
or a posteriori as $\alpha = \alpha(\delta, y^\delta)$ according to the strong discrepancy principle

$$\tau_1 \delta \le \|A x_\alpha^\delta - y^\delta\|_Y \le \tau_2 \delta \tag{3.7}$$

for prescribed constants $1 \le \tau_1 \le \tau_2 < \infty$. Then under Assumption 3.1 we have the
convergence rate (3.1) whenever in both cases all regularized solutions $x_\alpha^\delta$ for sufficiently
small $\delta > 0$ belong to $\mathcal{M}$.

Remark 3.3. As was shown in [19], the strong discrepancy principle (3.7) in Proposition 3.2
can be replaced with the more traditional (sequential) discrepancy principle,
where with sufficiently large $\alpha_0 > 0$, $0 < \zeta < 1$, and $\Delta_\zeta := \{\alpha_j : \alpha_j := \zeta^j \alpha_0,\ j = 1, 2, \dots\}$
the regularization parameter $\alpha = \alpha(\delta, y^\delta)$ is chosen as the largest parameter within $\Delta_\zeta$
such that $\|A x_\alpha^\delta - y^\delta\|_Y \le \tau \delta$ for prescribed $\tau > 1$. This, however, is more of interest if
the forward operator is nonlinear and duality gaps can prevent regularization parameters
$\alpha = \alpha(\delta, y^\delta)$ from fulfilling (3.7) simultaneously with lower and upper bounds. □
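
The sequential discrepancy principle described in this remark can be sketched as a simple loop over the geometric grid $\alpha_j = \zeta^j \alpha_0$. The example below is a minimal illustration under assumptions not made in the text: a truncated diagonal operator with hypothetical singular values $\sigma_k = 1/k$, the exponent $p = 2$ with the Euclidean norm as data norm, a deterministic noise vector of norm exactly $\delta$, and hypothetical function names. In this special case the $\ell^1$-Tikhonov minimizer is available in closed form by soft-thresholding, so no inner optimization routine is required.

```python
import math


def soft_threshold(v, alpha):
    """Scalar soft-thresholding: shrink v towards zero by alpha."""
    if v > alpha:
        return v - alpha
    if v < -alpha:
        return v + alpha
    return 0.0


def minimizer(y, sigmas, alpha):
    """Closed-form l1-Tikhonov minimizer for a diagonal operator and p = 2."""
    return [soft_threshold(s * yk, alpha) / (s * s) for yk, s in zip(y, sigmas)]


def residual_norm(x, y, sigmas):
    """Euclidean norm of A x - y for the diagonal operator."""
    return math.sqrt(sum((s * xk - yk) ** 2 for xk, yk, s in zip(x, y, sigmas)))


def sequential_discrepancy(y_delta, sigmas, delta, tau, alpha0=1.0, zeta=0.5,
                           max_steps=200):
    """Return (j, alpha_j, x) for the largest alpha_j = zeta**j * alpha0,
    j = 1, 2, ..., whose residual does not exceed tau * delta."""
    for j in range(1, max_steps + 1):
        alpha = alpha0 * zeta ** j
        x = minimizer(y_delta, sigmas, alpha)
        if residual_norm(x, y_delta, sigmas) <= tau * delta:
            return j, alpha, x
    raise RuntimeError("no admissible parameter found within max_steps")


n = 200
sigmas = [1.0 / k for k in range(1, n + 1)]         # assumed singular values
x_true = [1.0 / k ** 2 for k in range(1, n + 1)]    # non-sparse exact solution
delta, tau = 1e-3, 1.5
# Deterministic perturbation with Euclidean norm exactly delta.
noise = [delta / math.sqrt(n) * (-1) ** k for k in range(1, n + 1)]
y_delta = [s * xk + e for s, xk, e in zip(sigmas, x_true, noise)]

j, alpha, x_alpha = sequential_discrepancy(y_delta, sigmas, delta, tau)
l1_error = sum(abs(a - b) for a, b in zip(x_alpha, x_true))
print(j, alpha, l1_error)
```

Because the grid is scanned from large to small parameters, the first admissible $\alpha_j$ is also the largest one in $\Delta_\zeta$; the `max_steps` guard is a practical safeguard added here and is not part of the principle.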



4 Failure of approximate source conditions 

Now we come back to the situation $X = \ell^1(\mathbb{N})$ of $\ell^1$-regularization introduced in Section 2
and pose the following additional assumption.

Assumption 4.1. Let the solution $x^\dagger = (x_1^\dagger, x_2^\dagger, \dots) \in \ell^1(\mathbb{N})$ of (2.3) be non-sparse, i.e.,
$x^\dagger \notin \ell^0(\mathbb{N})$ and hence there is an infinite subsequence $\{x_{l_n}^\dagger \neq 0\}_{n=1}^{\infty}$ of nonzero components
of $x^\dagger$.



Lemma 4.2. Let Assumptions 2.2 and 4.1 hold and let $\xi \in \ell^\infty(\mathbb{N})$ be an arbitrary element
of $X^* = \ell^\infty(\mathbb{N})$. Then on the one hand, $d_\xi(R) \to 0$ for $R \to \infty$ if and only if $\xi \in c_0$. On
the other hand, if $s := \limsup_{k \to \infty} |\xi_k| > 0$ we have $d_\xi(R) \ge s$ for all $R > 0$.

Proof. By definition of the distance function, $d_\xi(R) \to 0$ for $R \to \infty$ if and only if
$\xi \in \overline{\mathcal{R}(A^*)}^{\,\ell^\infty(\mathbb{N})}$. Hence the first part of the assertion is a consequence of Proposition 2.4
where $\overline{\mathcal{R}(A^*)}^{\,\ell^\infty(\mathbb{N})} = c_0$ was proven.

For proving the second part of the assertion we take a subsequence $\{\xi_{l_n}\}_{n \in \mathbb{N}}$ with
$|\xi_{l_n}| \to s$ as $n \to \infty$ and assume $w \in Y^*$ with $\|w\|_{Y^*} \le R$. Because of

$$|[A^* w]_{l_n}| \le \|w\|_{Y^*} \, \|A e_{l_n}\|_Y \le R \, \|A e_{l_n}\|_Y \to 0 \quad \text{as } n \to \infty,$$

we see that $\|\xi - A^* w\|_{\ell^\infty(\mathbb{N})} \ge \sup_{n \in \mathbb{N}} |\xi_{l_n} - [A^* w]_{l_n}| \ge s$. Thus, $d_\xi(R) \ge s$ for all $R > 0$.
This completes the proof. □



Proposition 4.3. Under the Assumptions 2.2 and 4.1 the benchmark source condition
(3.2) always fails. Also the method of approximate source conditions is not applicable,
because we have for the corresponding distance function

$$d_\xi(R) = \inf_{w \in Y^* :\ \|w\|_{Y^*} \le R} \|\xi - A^* w\|_{\ell^\infty(\mathbb{N})} \ge 1 \quad \text{for all } R > 0. \tag{4.1}$$

Proof. As is well-known the subgradients $\xi = (\xi_1, \xi_2, \dots) \in \partial \|x^\dagger\|_{\ell^1(\mathbb{N})}$ can be made explicit
as

$$\xi_k \begin{cases} = 1 & \text{if } x_k^\dagger > 0, \\ \in [-1, 1] & \text{if } x_k^\dagger = 0, \\ = -1 & \text{if } x_k^\dagger < 0, \end{cases} \qquad k \in \mathbb{N}.$$

So we have $\xi \notin c_0$ by Assumption 4.1. Moreover, by this assumption we also have $|\xi_{l_n}| = 1$
for all $n \in \mathbb{N}$, that is $\limsup_{k \to \infty} |\xi_k| = 1$. Lemma 4.2 thus shows $d_\xi(R) \ge 1$ for all $R > 0$. This
means that $d_\xi(R)$ does not tend to zero as $R \to \infty$ and, in particular, that $\xi \notin \mathcal{R}(A^*)$
as a direct consequence of $\xi \notin c_0$, since $\xi \in \mathcal{R}(A^*)$ would imply $d_\xi(R) = 0$ for sufficiently large
$R > 0$. □

Remark 4.4. Since (4.1) implies that $d_\xi(R) \not\to 0$ as $R \to \infty$ and (3.5) fails, this is
an example for the case that the biadjoint operator $A^{**}$ is not injective although $A$ is
injective, where we have $A^{**} : (\ell^\infty(\mathbb{N}))^* \to Y^{**}$ here. □



5 Derivation of a variational inequality 

As outlined in the previous section, source conditions and even approximate source
conditions do not hold if the sought solution $x^\dagger \in \ell^1(\mathbb{N})$ to equation (2.3) is not sparse.
In the following we derive a variational inequality as introduced in Assumption 3.1 with
$\mathcal{M} = X = \ell^1(\mathbb{N})$ and $\beta = 1$ in the case of a non-sparse solution. By Proposition 3.2 we
then directly obtain a convergence rate. Since the index function $\varphi$ constructed below
is not explicitly available for choosing $\alpha > 0$, a posteriori choices of the regularization
parameter are to be preferred, and in particular the strong discrepancy principle (3.7)
ensures the convergence rate

$$\|x_\alpha^\delta - x^\dagger\|_{\ell^1(\mathbb{N})} = O(\varphi(\delta)) \quad \text{as } \delta \to 0.$$

We need the following lemma.

Lemma 5.1. For all $x \in \ell^1(\mathbb{N})$ and all $n \in \mathbb{N}$ the inequality

$$\|x - x^\dagger\|_{\ell^1(\mathbb{N})} \le \|x\|_{\ell^1(\mathbb{N})} - \|x^\dagger\|_{\ell^1(\mathbb{N})} + 2 \left( \sum_{k=n+1}^{\infty} |x_k^\dagger| + \sum_{k=1}^{n} |x_k - x_k^\dagger| \right)$$

is true.
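Proof. For $n \in \mathbb{N}$ define the projection $P_n : \ell^1(\mathbb{N}) \to \ell^1(\mathbb{N})$ by $P_n x := (x_1, \dots, x_n, 0, \dots)$.
Then obviously

$$\|x\|_{\ell^1(\mathbb{N})} = \|P_n x\|_{\ell^1(\mathbb{N})} + \|(I - P_n) x\|_{\ell^1(\mathbb{N})}$$

for all $x \in \ell^1(\mathbb{N})$. Based on this equality we see that

$$\begin{aligned}
&\|x - x^\dagger\|_{\ell^1(\mathbb{N})} - \|x\|_{\ell^1(\mathbb{N})} + \|x^\dagger\|_{\ell^1(\mathbb{N})} \\
&\quad = \|P_n(x - x^\dagger)\|_{\ell^1(\mathbb{N})} + \|(I - P_n) x^\dagger\|_{\ell^1(\mathbb{N})} + \|(I - P_n)(x - x^\dagger)\|_{\ell^1(\mathbb{N})} \\
&\qquad - \|(I - P_n) x\|_{\ell^1(\mathbb{N})} + \|P_n x^\dagger\|_{\ell^1(\mathbb{N})} - \|P_n x\|_{\ell^1(\mathbb{N})}.
\end{aligned}$$

Consequently, together with the triangle inequalities

$$\|(I - P_n)(x - x^\dagger)\|_{\ell^1(\mathbb{N})} \le \|(I - P_n) x\|_{\ell^1(\mathbb{N})} + \|(I - P_n) x^\dagger\|_{\ell^1(\mathbb{N})}$$

and

$$\|P_n x^\dagger\|_{\ell^1(\mathbb{N})} \le \|P_n(x - x^\dagger)\|_{\ell^1(\mathbb{N})} + \|P_n x\|_{\ell^1(\mathbb{N})}$$

we obtain

$$\|x - x^\dagger\|_{\ell^1(\mathbb{N})} - \|x\|_{\ell^1(\mathbb{N})} + \|x^\dagger\|_{\ell^1(\mathbb{N})} \le 2 \left( \|P_n(x - x^\dagger)\|_{\ell^1(\mathbb{N})} + \|(I - P_n) x^\dagger\|_{\ell^1(\mathbb{N})} \right).$$

□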
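
The inequality established by this proof, $\|x - x^\dagger\|_{\ell^1} - \|x\|_{\ell^1} + \|x^\dagger\|_{\ell^1} \le 2\big( \sum_{k>n} |x_k^\dagger| + \sum_{k \le n} |x_k - x_k^\dagger| \big)$, can be sanity-checked numerically on finite sequences. The following Python sketch is only such a check on randomly drawn truncated sequences and is no substitute for the proof; the sequence length, the random model and the function names are assumptions made here.

```python
import random


def l1(v):
    """l1-norm of a finite sequence."""
    return sum(abs(t) for t in v)


def lemma_gap(x, x_dag, n):
    """Right-hand side minus left-hand side of the inequality for index n."""
    tail = sum(abs(t) for t in x_dag[n:])                       # sum over k > n
    head = sum(abs(a - b) for a, b in zip(x[:n], x_dag[:n]))    # sum over k <= n
    lhs = l1([a - b for a, b in zip(x, x_dag)]) - l1(x) + l1(x_dag)
    return 2.0 * (tail + head) - lhs


random.seed(0)
N = 30
gaps = []
for _ in range(200):
    # x_dag: decaying but generally non-sparse; x: arbitrary comparison element.
    x_dag = [random.uniform(-1.0, 1.0) / (k * k) for k in range(1, N + 1)]
    x = [random.uniform(-1.0, 1.0) for _ in range(N)]
    gaps.extend(lemma_gap(x, x_dag, n) for n in range(1, N + 1))

smallest_gap = min(gaps)
print(smallest_gap)
```

A nonnegative `smallest_gap` (up to rounding) is what the lemma predicts for every splitting index $n$.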

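Theorem 5.2. The variational inequality (3.6) holds true with $\beta = 1$, $\mathcal{M} = X = \ell^1(\mathbb{N})$,
$E(x, x^\dagger) = \|x - x^\dagger\|_{\ell^1(\mathbb{N})}$, $\Omega(x) = \|x\|_{\ell^1(\mathbb{N})}$, and

$$\varphi(t) = 2 \inf_{n \in \mathbb{N}} \left( \sum_{k=n+1}^{\infty} |x_k^\dagger| + t \sum_{k=1}^{n} \|f_k\|_{Y^*} \right). \tag{5.1}$$

The function $\varphi$ is a concave index function.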
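Proof. By Assumption 2.2 (c) we have

$$\sum_{k=1}^{n} |x_k - x_k^\dagger| = \sum_{k=1}^{n} |\langle e_k, x - x^\dagger \rangle_{\ell^\infty(\mathbb{N}) \times \ell^1(\mathbb{N})}| \le \sum_{k=1}^{n} \|f_k\|_{Y^*} \, \|A(x - x^\dagger)\|_Y$$

for all $n \in \mathbb{N}$ and with the help of Lemma 5.1 we obtain

$$\|x - x^\dagger\|_{\ell^1(\mathbb{N})} \le \|x\|_{\ell^1(\mathbb{N})} - \|x^\dagger\|_{\ell^1(\mathbb{N})} + 2 \left( \sum_{k=n+1}^{\infty} |x_k^\dagger| + \|A(x - x^\dagger)\|_Y \sum_{k=1}^{n} \|f_k\|_{Y^*} \right)$$

for all $n \in \mathbb{N}$. Taking the infimum over $n$ on the right-hand side yields the desired
variational inequality.

It remains to show that $\varphi$ is a concave index function. As an infimum of affine functions
$\varphi$ is concave and upper semi-continuous. Since $\varphi(t) < \infty$ for $t \in [0, \infty)$, $\varphi$ is even
continuous on $(0, \infty)$. Together with $\varphi(0) = 0$ the upper semi-continuity in $t = 0$ yields
$\varphi(t) \to 0$ if $t \to +0$, that is, continuity in $t = 0$. To show that $\varphi$ is strictly increasing
take $t_1, t_2 \in [0, \infty)$ with $t_1 < t_2$. Since by (2.7) the infimum in the definition of $\varphi(t_2)$ is
attained at some $n_2 \in \mathbb{N}$, we obtain

$$\varphi(t_1) \le 2 \left( \sum_{k=n_2+1}^{\infty} |x_k^\dagger| + t_1 \sum_{k=1}^{n_2} \|f_k\|_{Y^*} \right) < \varphi(t_2).$$

□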
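
For given decay and growth data the rate function $\varphi(t) = 2 \inf_{n} \big( \sum_{k>n} |x_k^\dagger| + t \sum_{k \le n} \|f_k\|_{Y^*} \big)$ can be evaluated directly once the sequences are truncated. The sketch below is a hedged illustration: the truncation to `N` terms, the model data $|x_k^\dagger| = k^{-2}$ and $\|f_k\|_{Y^*} = k$, and the function name `phi` are assumptions introduced here. With truncated data the infimum runs over $n = 1, \dots, N$ only, so the computed values approximate the rate function of the infinite sequence.

```python
def phi(t, x_dag_abs, f_norms):
    """Evaluate 2 * min_{1 <= n <= N} ( sum_{k>n} |x_k| + t * sum_{k<=n} ||f_k|| )
    for finite lists x_dag_abs = [|x_1|, ..., |x_N|] and f_norms = [||f_1||, ...]."""
    N = len(x_dag_abs)
    # tails[n] = sum of |x_k| over k > n (1-based k); built backwards for accuracy.
    tails = [0.0] * (N + 1)
    for n in range(N - 1, -1, -1):
        tails[n] = tails[n + 1] + x_dag_abs[n]
    best = float("inf")
    head = 0.0
    for n in range(1, N + 1):
        head += f_norms[n - 1]          # head = sum of ||f_k|| over k <= n
        best = min(best, tails[n] + t * head)
    return 2.0 * best


N = 2000
x_dag_abs = [1.0 / k ** 2 for k in range(1, N + 1)]   # assumed solution decay
f_norms = [float(k) for k in range(1, N + 1)]          # assumed growth of ||f_k||

ts = [10.0 ** (-j) for j in range(6, 0, -1)]           # increasing grid of t values
values = [phi(t, x_dag_abs, f_norms) for t in ts]
print(list(zip(ts, values)))
```

Each candidate $n$ contributes an affine function of $t$, so the minimum over $n$ is concave and nondecreasing by construction, which mirrors the argument used in the proof.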
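Example 5.3 (Hölder rates). As one sees, the rate function $\varphi$ in Theorem 5.2 depends on
decay properties of the solution components $|x_k^\dagger|$ for $k \to \infty$ and on the growth properties
of the finite sums $\sum_{k=1}^{n} \|f_k\|_{Y^*}$ for $n \to \infty$. If decay and growth are of monomial type as

$$\sum_{k=n+1}^{\infty} |x_k^\dagger| \le K_1 \, n^{-\mu}, \qquad \sum_{k=1}^{n} \|f_k\|_{Y^*} \le K_2 \, n^{\nu}, \qquad \mu, \nu > 0, \tag{5.2}$$

with some constants $K_1, K_2 > 0$, then we have from Theorem 5.2 (see formula (5.1))
and Proposition 3.2 that the strong discrepancy principle for choosing the regularization
parameter $\alpha > 0$ yields the Hölder convergence rates

$$\|x_\alpha^\delta - x^\dagger\|_{\ell^1(\mathbb{N})} = O\big( \delta^{\frac{\mu}{\mu + \nu}} \big) \quad \text{as } \delta \to 0. \tag{5.3}$$

Whenever exponents $\mu > 1$, $\nu > 0$, and constants $K_1, K_2 > 0$ exist such that

$$|x_k^\dagger| \le K_1 \, k^{-\mu}, \qquad \|f_k\|_{Y^*} \le K_2 \, k^{\nu}, \qquad \text{for all } k \in \mathbb{N},$$

the tail sums and partial sums satisfy (5.2) with the exponents $\mu - 1$ and $\nu + 1$ in place of $\mu$
and $\nu$ (and modified constants), so that the rate result (5.3) can be rewritten as

$$\|x_\alpha^\delta - x^\dagger\|_{\ell^1(\mathbb{N})} = O\big( \delta^{\frac{\mu - 1}{\mu + \nu}} \big) \quad \text{as } \delta \to 0.$$

In particular, for the compact operator $\tilde A$ in the diagonal case of Example 2.6 the
exponent $\nu > 0$ can be interpreted as the degree of ill-posedness expressed by the operator $\tilde A$
if the decay rate of the singular values of $\tilde A$ is of the form $\sigma_k \sim k^{-\nu}$.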
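
The Hölder exponent $\mu / (\mu + \nu)$ can be observed numerically by inserting the monomial bounds directly into the rate function, i.e. by evaluating $2 \inf_n (K_1 n^{-\mu} + t K_2 n^{\nu})$ and reading off the slope in a log-log comparison of two arguments. The code below is a rough sketch under assumptions chosen here: the bounds are taken to hold with equality, the infimum is searched over $n \le$ `n_max` only, and the function names are hypothetical.

```python
import math


def phi_monomial(t, mu, nu, K1=1.0, K2=1.0, n_max=10000):
    """2 * min over 1 <= n <= n_max of ( K1 * n**(-mu) + t * K2 * n**nu )."""
    return 2.0 * min(K1 * n ** (-mu) + t * K2 * n ** nu
                     for n in range(1, n_max + 1))


def observed_exponent(t_small, t_large, mu, nu):
    """Slope of log(phi) against log(t) between two arguments."""
    ratio = phi_monomial(t_large, mu, nu) / phi_monomial(t_small, mu, nu)
    return math.log(ratio) / math.log(t_large / t_small)


# mu = nu = 1: predicted exponent mu / (mu + nu) = 1/2.
slope_a = observed_exponent(1e-6, 1e-4, mu=1, nu=1)
# mu = 2, nu = 1: predicted exponent 2/3.
slope_b = observed_exponent(1e-9, 1e-6, mu=2, nu=1)
print(slope_a, slope_b)
```

The balance between the two terms is reached for $n$ of the order $t^{-1/(\mu+\nu)}$, so `n_max` has to be large enough to contain this index for the smallest argument used; small deviations of the observed slope stem from the restriction to integer $n$.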

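Remark 5.4. If $x^\dagger$ is sparse and if $\bar n \in \mathbb{N}$ is the largest $n$ for which $x_n^\dagger \neq 0$, then the
theorem yields a variational inequality with

$$\varphi(t) \le 2 \left( \sum_{k=1}^{\bar n} \|f_k\|_{Y^*} \right) t.$$

Consequently the $O$-constant in the corresponding convergence rate $\|x_\alpha^\delta - x^\dagger\|_{\ell^1(\mathbb{N})} = O(\delta)$
depends on the size of the support of $x^\dagger$. □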

Theorem 5.2 yields a function $\varphi$ and a corresponding variational inequality (3.6) for
arbitrary $x^\dagger \in X = \ell^1(\mathbb{N})$. But as a consequence of Assumption 2.2 (c) the proof of
Theorem 5.2 only works for injective operators $A$. The converse assertion would be that
if for each $\bar x \in X$ there is some index function $\varphi$ such that the variational inequality

$$\|x - \bar x\|_{\ell^1(\mathbb{N})} \le \|x\|_{\ell^1(\mathbb{N})} - \|\bar x\|_{\ell^1(\mathbb{N})} + \varphi\big( \|A(x - \bar x)\|_Y \big) \quad \text{for all } x \in X$$

is true, then $A$ is injective. The following proposition states that even the existence of
only one such $\bar x$ is sufficient for injectivity of $A$.

Proposition 5.5. If there is some $\bar x \in \ell^1(\mathbb{N})$ with $\bar x_k \neq 0$ for all $k \in \mathbb{N}$ such that the
variational inequality

$$\beta \, \|x - \bar x\|_{\ell^1(\mathbb{N})} \le \|x\|_{\ell^1(\mathbb{N})} - \|\bar x\|_{\ell^1(\mathbb{N})} + \varphi\big( \|A(x - \bar x)\|_Y \big) \quad \text{for all } x \in \ell^1(\mathbb{N}) \tag{5.4}$$

is satisfied with $\beta > 0$ and an arbitrary index function $\varphi$, then $A$ is injective.
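Proof. For $x \in \mathcal{N}(A) \setminus \{0\}$, where $\mathcal{N}(A)$ denotes the null-space of $A$, the variational
inequality (5.4) applied to $\bar x + x$ yields

$$h(x) := \beta \, \|x\|_{\ell^1(\mathbb{N})} - \|\bar x + x\|_{\ell^1(\mathbb{N})} + \|\bar x\|_{\ell^1(\mathbb{N})} \le \varphi(\|Ax\|_Y) = 0.$$

If there is some $t > 0$ such that $h(tx) > 0$ we have a contradiction to $tx \in \mathcal{N}(A)$. Thus,
we can assume $h(tx) \le 0$ for all $t > 0$.

Define two index sets

$$I_1(t) := \{k \in \mathbb{N} : t |x_k| \le |\bar x_k|\}, \qquad I_2(t) := \{k \in \mathbb{N} : t |x_k| > |\bar x_k|\},$$

where $t > 0$. Simple calculations then show that

$$|\bar x_k - t x_k| + |\bar x_k + t x_k| = \begin{cases} 2 |\bar x_k|, & k \in I_1(t), \\ 2 t |x_k|, & k \in I_2(t), \end{cases}$$

for all $k \in \mathbb{N}$. Thus we have for all $t > 0$

$$\begin{aligned}
h(-tx) &= 2\beta \, \|tx\|_{\ell^1(\mathbb{N})} - \|\bar x - tx\|_{\ell^1(\mathbb{N})} - \|\bar x + tx\|_{\ell^1(\mathbb{N})} + 2 \|\bar x\|_{\ell^1(\mathbb{N})} - h(tx) \\
&\ge 2\beta \, \|tx\|_{\ell^1(\mathbb{N})} - \|\bar x - tx\|_{\ell^1(\mathbb{N})} - \|\bar x + tx\|_{\ell^1(\mathbb{N})} + 2 \|\bar x\|_{\ell^1(\mathbb{N})} \\
&= 2 \Big( \|\bar x\|_{\ell^1(\mathbb{N})} - \sum_{k \in I_1(t)} |\bar x_k| \Big) + 2t \Big( \beta \, \|x\|_{\ell^1(\mathbb{N})} - \sum_{k \in I_2(t)} |x_k| \Big).
\end{aligned}$$

Now choose $n \in \mathbb{N}$ such that there is some $k_0 \in \{1, \dots, n\}$ with $x_{k_0} \neq 0$ and such that

$$\sum_{k=n+1}^{\infty} |x_k| < \beta \, \|x\|_{\ell^1(\mathbb{N})}.$$

Set

$$t := \min \left\{ \frac{|\bar x_k|}{|x_k|} : k \in \{1, \dots, n\},\ x_k \neq 0 \right\}.$$

Then

$$\sum_{k \in I_2(t)} |x_k| \le \sum_{k=n+1}^{\infty} |x_k| < \beta \, \|x\|_{\ell^1(\mathbb{N})}$$

and therefore $h(-tx) > 0$, which contradicts $-tx \in \mathcal{N}(A)$.

In all cases the assumption $x \in \mathcal{N}(A) \setminus \{0\}$ led to a contradiction. Thus, $\mathcal{N}(A) = \{0\}$
is proven. □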



6 Conclusion 

We have shown that, in the case of $\ell^1$-regularization, variational inequalities can significantly
increase the range of solutions for which convergence rates can be shown compared
to source conditions. Of course, the results are still preliminary since they rely on the
injectivity of the operator $A$, which is an unwanted feature in many setups, in particular
if motivated by compressed sensing. However, it provides an interesting insight into
the current borders of regularization theory and a strong motivation to study variational
inequality approaches in particular for singular regularization functionals.

Thinking about potential extensions of the approach and weakening of the assumptions 
one observes that currently several steps are based on the choice of "minimal" subgradi- 
ents, i.e. the entries of the subgradient are set to zero outside the support of the solution. 
From a source condition perspective, it can be seen as the assumption that for all one- 
sparse solutions a very strong source condition is satisfied by the minimal subgradient. A 
feasible approach to extend the results of this paper might be to consider larger classes of 
subgradients, whose absolute value should however be bounded away from one or decay 
to zero outside the support. The exact type of needed condition remains to be determined 
in the future. 